ABSTRACT


In keeping with the experimental nature of the Illinois Pattern
Recognition Computer (Illiac III), the arithmetic units are intended
to be a practical testing ground for recent theoretical work in
computer arithmetic. This paper describes the use of redundant number
systems and the design of a structure with which multiplication and
division are executed in radix 256. The heart of the unit is the
stored-sign subtracter, a recently discovered member of the family of
borrow-save subtracters and carry-save adders. A cascade of these
subtracters, controlled by a multiplier recoder, provides
multiplication. The same structure, controlled by a "model division"
(a quotient recoder), performs division.




ACKNOWLEDGEMENT 

The  author  wishes  to  acknowledge  and  thank  Professor  James  E. 
Robertson,  Professor  Bruce  H.  McCormick  and  Mrs.  Tuh-Kai  Koo  for  their 
assistance  in  the  design  effort  described  in  this  paper.   Mrs.  Koo 
wrote  extensive  simulation  programs  which  were  used  to  validate  the 
arithmetic  algorithms. 




TABLE OF CONTENTS

INTRODUCTION
    Adder-Subtracter
    Multiplication
    Division
ADDER-SUBTRACTER
    Background
    Definition
    Properties
        Input-Output Compatibility
        Limited Borrow
        Unique Zero
        Negation
        Least Significant Digit
    Overflow Detection
    Truncation Error
    Sign Detection
    Assimilation
    Implementation
MULTIPLICATION
    Background
    Recoding Scheme
    Multiplication Structure
    Brief Operation Description
    Truncation Error
DIVISION
    Background
    Model Division
    Operational Description of Model Division
    Division Structure
    Brief Operational Description of Full Precision Division Scheme
    Truncation Error
REFERENCES
APPENDIX
    Proof of the Validity of the Correction Scheme for Bogus Overflow
    Brief Description of Illiac III Computer System


INTRODUCTION 


In keeping with the experimental nature of the Illinois
Pattern Recognition Computer (Illiac III), the arithmetic units are
intended to be a practical testing ground for some recent theoretical
work in computer arithmetic. The bulk of this work centers upon the
use of redundant number systems and/or the use of higher radix methods.
The design of the arithmetic units of Illiac III exhibits both
techniques. They are of primary importance in the adder-subtracter
structure, the multiplication structure, and the division structure.

Adder-Subtracter 

A key factor in the rapid execution of the iterative
sequences of multiplication and division is the operation time of the
adder-subtracter. The design used in Illiac III is a member of a
family of limited carry-borrow propagation adder-subtracters. The
necessity for propagation of carries or borrows is eliminated by
permitting the results of an operation to be represented in a redundant
form. Redundancy is achieved by using a signed-digit format. Associated
with each digit is a magnitude of either 1 or 0, and a sign of either
positive or negative. Changing a number in a signed-digit format to a
conventional non-redundant representation requires a carry or borrow
propagation, but only one such conversion is required per arithmetic
operation and it may be accelerated by use of lookahead techniques.
The adder-subtracter structure exhibits several other interesting
properties not found in the conventional carry-save adder or
borrow-save subtracter.

Multiplication 

In other than the adder-subtracter complex, high-speed
operation is also obtained by extensive use of redundancy and by
executing operations in radices greater than two. Multiplication, for
example, is performed radix 256, i.e. 8 bits of the multiplier are
retired in one pass from the primary to the secondary rank of the
accumulator. By recoding, redundancy is introduced in such a manner
that all the required multiples of the multiplicand may be formed
merely by shifting.

Division 

In division, redundancy is introduced into the representation
of the quotient. As a consequence quotient digits may be determined
from a relatively few high-order bits of the divisor and partial
remainder; full precision comparison of the divisor and partial
remainder is not required. The division algorithm makes efficient use
of the large amount of hardware devoted to high speed multiplication
and is also performed radix 256. Eight bits of the quotient are
generated in one pass from the primary to the secondary rank of the
accumulator.

Appendix

The Appendix includes the proof of the validity of the bogus
overflow correction scheme and an introduction to the entire Illiac III
system.
ADDER-SUBTRACTER


Background 

It has long been realized that the execution of multiplication
is substantially accelerated by the use of adders in which carry
propagation is eliminated until a terminal step. Recently, Robertson
[1] has noted that the traditional carry-save adder and borrow-save
subtracter, derived by the modification of conventional adders or
subtracters, are but two members of a larger family of limited
carry-borrow propagation adder-subtracters. At least two of the designs
obtained using his deterministic procedures appear to be new and of
practical importance. They are the stored sign adder and the stored
sign subtracter. The design properties of both are similar and in the
final analysis both are actually capable of either addition or
subtraction. The stored sign subtracter has been implemented in
Illiac III. This device is also referred to as a signed-digit
subtracter. The two names will be used interchangeably in this paper.

Definition 

A typical position of a signed-digit subtracter is shown in
Figure 1. Each position is a three-input, two-output device together
with an interpositional connection and a "NEG" control line. The
symbol $Y_i$ represents the ith bit of the subtrahend (minuend -
subtrahend = difference) in conventional binary form*. $S_i$ and $X_i$
together comprise the ith minuend digit in a redundant format. $X_i$ is
interpreted as a magnitude, either 1 or 0, and $S_i$ as a sign; 0 is
positive and 1 is negative. The digital values 1, 0, or $\bar{1}$
(overbar denotes negation) are thus represented as follows:


*The design described here employs one operand in conventional
form and one in redundant form. Designs have been proposed in which
both operands are represented redundantly. See Rohatsh [2] and Borovec [3].
[Figure 1: block diagram of position i. Inputs: the subtrahend bit
$Y_i$, the minuend digit ($S_i$, $X_i$), the interpositional connection
$C_i$, and the control lines NEG and G. Outputs: the difference digit
($T_i$, $Z_i$) and the interpositional connection $C_{i-1}$.]

$S_i$ = sign of minuend digit
$X_i$ = magnitude of minuend digit
$Y_i$ = subtrahend in conventional binary form
$T_i$ = sign of difference digit
$Z_i$ = magnitude of difference digit
NEG = control to complement $T_i$
      (if NEG = 1 then $T_i$ is complemented, else not)
G = gate on interpositional connections
$C_i$ = interpositional connection

$$T_i = C_i \oplus \mathrm{NEG}$$

$$Z_i = C_i \oplus X_i \oplus Y_i$$

$$C_{i-1} = (S_i X_i \lor \bar{X}_i Y_i)\, G$$

$$C_i = (S_{i+1} X_{i+1} \lor \bar{X}_{i+1} Y_{i+1})\, G$$

Figure 1 - Typical Position of a Signed-Digit Subtracter
$S_i$   $X_i$   Digit value

  0       0        +0
  0       1        +1
  1       0        -0
  1       1        -1


The logical equations for a stored sign adder may be derived by changing
the sign of all non-zero digits in a truth table for the equations
given in Figure 1.

The gate signal, G, shown in Figure 1 is not inherent in
the logical design of a stored sign adder or subtracter but is necessary
for a particular application in Illiac III. During the assimilation
of a redundantly represented result into conventional form the
requirement arises for the Z output to be identical to the X input. In
general, the addition or subtraction of zero will not guarantee this.
However, with G=0, all interpositional inputs, $C_i$, are 0, and thus
$Z_i = X_i \oplus Y_i$; the subtracter will perform the exclusive OR
function with G=0. Furthermore, if all $Y_i$ are also 0, then
$Z_i = X_i$. The signal G will always be 1 whenever the device is
actually being used for addition or subtraction.
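The position equations of Figure 1 can be exercised in a small software model. The following Python sketch is illustrative only: the function names, the list layout (most significant digit first) and the treatment of the leftmost borrow as a separate return value are assumptions made here, not part of the hardware description.

```python
from itertools import product

def sd_subtract(S, X, Y, NEG=0, G=1):
    """One pass through a row of signed-digit subtracter positions.

    S, X : sign and magnitude bits of the minuend digits, MSB first.
    Y    : subtrahend bits in conventional binary, MSB first.
    Returns (borrow_out, T, Z): the interpositional signal leaving the
    leftmost position, then the sign and magnitude bits of the result.
    """
    n = len(X)
    # C_{i-1} = (S_i X_i  OR  (NOT X_i) Y_i) AND G, produced by position i.
    c_out = [((S[i] & X[i]) | ((1 - X[i]) & Y[i])) & G for i in range(n)]
    # Position i receives C_i from its right-hand neighbour; the last gets 0.
    c_in = c_out[1:] + [0]
    T = [c_in[i] ^ NEG for i in range(n)]
    Z = [c_in[i] ^ X[i] ^ Y[i] for i in range(n)]
    return c_out[0], T, Z

def sd_value(S, X):
    """Algebraic value of a signed-digit integer (sign bit 1 = negative)."""
    v = 0
    for s, x in zip(S, X):
        v = 2 * v + (1 - 2 * s) * x
    return v

def bin_value(Y):
    """Value of a conventional binary integer, MSB first."""
    v = 0
    for y in Y:
        v = 2 * v + y
    return v

def exhaustive_check(n):
    """True if Z* = X* - Y holds for every n-digit input (NEG=0, G=1)."""
    for S, X, Y in product(product((0, 1), repeat=n), repeat=3):
        borrow, T, Z = sd_subtract(S, X, Y)
        # The borrow leaving the leftmost position carries weight -2**n.
        if sd_value(T, Z) - borrow * 2**n != sd_value(S, X) - bin_value(Y):
            return False
    return True
```

For example, `sd_subtract([0, 0, 0], [1, 0, 1], [0, 1, 1])` models 5 - 3 and returns sign bits `[1, 0, 0]` and magnitude bits `[0, 1, 0]`, a redundant form of 2. With `G=0` and all subtrahend bits 0 the magnitude output simply repeats the X input.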

Properties 

Input-Output Compatibility - An important property of the
subtracter in the execution of iterative operations such as
multiplication is the fact that the output is in the same signed-digit
format as the input. $Z_i$ is the magnitude and $T_i$ is the sign of
the ith digit of the output.

Limited Borrow - The introduction of redundancy in the output
of the subtracter has permitted the length of the borrow propagation
chain to be drastically limited. The interpositional connection,
$C_i$, is a function of only the inputs to the adjacent position, i+1.
It is not a propagating borrow.

Unique Zero - Note that although the representation is
redundant, the representation of zero is unique except for sign. A
number in the signed-digit format is zero if and only if all magnitude
bits are zero. For a signed-digit representation in radix r, the
requirement for a unique representation of zero demands that the
magnitude of allowed digit values not exceed r-1.

Negation - Another property of this logical structure is the
ability to algebraically negate a number in signed-digit format merely
by logically complementing all the sign bits. There is no analogous
property for the conventional carry-save adder or borrow-save subtracter.
This feature of the signed-digit subtracter permits additions and
subtractions in a cascade of such devices to be interleaved in any
manner desired.

In Figure 1, NEG is a control signal which when set to
logical 1 complements $T_i$ and when set to logical 0 allows $T_i$ to
pass unchanged. Now consider a subtracter consisting of adjacent,
interconnected positions such as shown in Figure 1 and let

Y = the algebraic value of the subtrahend in conventional form;

X* = the algebraic value of the minuend in signed-digit form, and

Z* = the algebraic value of the difference.

With NEG = '0' the device is truly a subtracter and Z* = X*-Y. With
NEG = '1' the output is negated and thus Z* = -(X*-Y). Now note that
if complementing circuits are added to the $S_i$ input so that both X*
and Z* may be independently negated it is possible to form
Z* = -(-X*-Y) = X* + Y and the device is adding. For many applications
the negating circuits for the sign bits, $S_i$, need not be included in
the subtracter per se; rather, the same result is achieved by gating
the complement outputs of the register containing $S_i$ or, when the
subtracters are cascaded, by negating the output of the previous stage.
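The identity Z* = -(-X* - Y) = X* + Y can be checked with a compact model. This is a sketch under stated assumptions: digits are held as (sign, magnitude) pairs with the most significant digit first, one extra position is added on the left so that the final borrow is not lost, and all function names are hypothetical.

```python
from itertools import product

def value(digits):
    """Algebraic value of a list of (sign, magnitude) digits, MSB first."""
    v = 0
    for s, x in digits:
        v = 2 * v + (1 - 2 * s) * x
    return v

def negate(digits):
    """Negate by complementing every sign bit; magnitudes are untouched."""
    return [(1 - s, x) for s, x in digits]

def subtract(digits, y_bits, neg=0):
    """Signed-digit subtracter: X* - Y if neg=0, -(X* - Y) if neg=1.

    One zero position is prepended so the leftmost borrow stays in range.
    """
    digits = [(0, 0)] + list(digits)
    y_bits = [0] + list(y_bits)
    # Borrow generated by each position: S X  OR  (NOT X) Y.
    c = [(s & x) | ((1 - x) & y) for (s, x), y in zip(digits, y_bits)]
    c_in = c[1:] + [0]          # each position sees its right neighbour
    return [(ci ^ neg, ci ^ x ^ y)
            for ci, (s, x), y in zip(c_in, digits, y_bits)]

def add(digits, y_bits):
    """Addition built from the subtracter: -(-X* - Y) = X* + Y."""
    return subtract(negate(digits), y_bits, neg=1)

def check(n):
    """Exhaustively confirm negation and addition on all n-digit inputs."""
    for digits in product(product((0, 1), repeat=2), repeat=n):
        for y_bits in product((0, 1), repeat=n):
            y = int("".join(map(str, y_bits)), 2)
            if value(negate(digits)) != -value(digits):
                return False
            if value(add(digits, y_bits)) != value(digits) + y:
                return False
    return True
```

With the minuend digits `[(0, 1), (0, 0), (1, 1)]` (value 3) and subtrahend bits `[1, 0, 1]` (value 5), `subtract` yields a redundant form of -2 and `add` a redundant form of 8.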

This ability to negate a result while it is still in a
redundant form also expedites the execution of floating point addition
and subtraction. In the floating point format adopted for Illiac III
the mantissa is considered to be positive, i.e. to be a magnitude.
The sign is given by a bit apart from the mantissa. In multiplication
and division the sign of the result is the exclusive OR of the signs of
the operands. In addition and subtraction the sign determination is
more complicated: it depends upon the signs and the relative magnitude
of the operands.
Consider two operands with magnitudes A and B and with signs
SIGNA and SIGNB, respectively. A logical one denotes a negative
quantity; a logical zero denotes a positive quantity. The table below
gives the sign of the result as a function of the sign of the operands
and their relative magnitude:


                    SIGN(A+B)          SIGN(A-B)

SIGNA   SIGNB     A≥B     A<B        A≥B     A<B

  0       0        0       0          0       1
  0       1        0       1          0       0
  1       0        1       0          1       1
  1       1        1       1          1       0

If the exponents of the operands are different, then the relative
magnitude is readily determined from the difference of the exponents.
But if the exponents are equal and SIGNA ≠ SIGNB for addition, or SIGNA
= SIGNB for subtraction, then the sign of the result cannot be determined
prior to actually performing the operation.
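The sign rule for sign-magnitude operands can be written as a short function. This is a sketch of the rule as it follows from sign-magnitude arithmetic, not of the Illiac III control logic; the function name and the `a_ge_b` flag (true when magnitude A is at least magnitude B) are assumptions made for illustration.

```python
def result_sign(sign_a, sign_b, a_ge_b, subtract=False):
    """Sign bit (1 = negative) of A+B or A-B for sign-magnitude operands.

    sign_a, sign_b : sign bits of the operands.
    a_ge_b         : True when magnitude A >= magnitude B.
    """
    # Subtraction is addition with the sign of the second operand flipped.
    effective_b = sign_b ^ int(subtract)
    if sign_a == effective_b:
        # Magnitudes add, so the sign is known before the operation.
        return sign_a
    # Magnitudes subtract; the operand of larger magnitude decides.
    return sign_a if a_ge_b else effective_b
```

The cases in which the two effective signs differ are exactly those in which the relative magnitude must be known, which is why equal exponents force the operation to be performed before the sign is available.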

First consider the cases in which the sign of the result
may be determined. If the sign is known to be negative the result
is negated prior to the conversion to a conventional form. The ability
to negate the redundant form of the result permits this. In cases
in which the sign is not known prior to calculation, the sign of the
result is assumed to be positive, the operation is performed and the
result is then converted into a conventional form. The high order bit
of the converted result is the sign. If it is negative then the
redundant result (still present on the outputs of the subtracter) is
negated and then again converted to a conventional form. The necessity
for two conversions would be avoided if the sign of the result could be
determined from the redundant form. However, as discussed in a later
section, sign determination is complicated by use of redundant notation.
The logic required is of the same order of complexity as that required
to convert the redundant result to a conventional form in which the sign
is apparent.

Least Significant Digit - A basic property of a stored sign
adder or subtracter is that the position of the least significant digit
need not be known. A conventional adder used only for addition does
not require the insertion of a carry into the least significant
digital position. Similarly, a subtracter used only for subtraction
does not require borrow insertion. Hence, the combination
adder-subtracter does not require insertion of a carry during addition
or a borrow during subtraction.

Since there is no requirement for a carry or borrow
insertion in the least significant position, a signed-digit subtracter
of a given length may be partitioned into several subtracters of
smaller length. Furthermore, by suitably partitioning the NEG control
signal, addition could be performed in some groups while subtraction
occurs in others. This facility is of application in variable
length operand formats and for parallel vector arithmetic. Although
neither of these is available in the initial version of the Illiac
III arithmetic units, the potential usefulness of vector operations
influenced the decision to implement a signed-digit subtracter.
Vector facilities could be included in a subsequent version of the
arithmetic unit without major modifications. A very limited use of
this facility is being made in performing integer division. To be
compatible with the floating point division algorithm an integer
divisor or dividend in two's complement negative form is converted to
sign-magnitude form during a preliminary step. In performing this
conversion a 64-bit signed-digit subtracter is used as two 32-bit
subtracters.

Overflow  Detection 

For redundant representations it is possible to derive
sufficient but not necessary conditions for overflow detection.

Let

$$Z^* = Z_0^* + \sum_{i=1}^{n} Z_i^* \, 2^{-i}$$

with the constraint $-1 < Z^* < 1$. Inspection of $Z_0^*$ and $Z_1^*$
gives rise to three possible range conditions: overflow, no overflow,
or maybe overflow. The latter condition means that overflow may or may
not occur on assimilation to conventional form. Table 1 defines the
range of $Z^*$ for all possible combinations of $Z_0^*$ and $Z_1^*$.
In Illiac III overflow is checked only after the result has been
converted to conventional form. There are sufficient subtracter
positions to the left of the radix point to insure that no high-order
digit is lost.

But in using redundant representations it is not
obvious what constitutes a "sufficient" number of subtracter positions.
The decision is complicated by the fact that although the algebraic
value of a redundantly represented number may be within the range of,
say, n non-redundant digits, the actual form of the redundantly
represented number may require more than n digits. This point is
illustrated by an example.

Consider an 8 bit integer represented in a conventional
binary format. If I denotes this integer then the allowable positive
range of I is 0 ≤ I ≤ 255. Conversely, given a conventionally
represented binary integer, I, in the range 0 ≤ I ≤ 255, an 8-bit
register should be adequate to hold I. Now let I* be a signed-digit
version of I. We must now assign two bits per digital position of our
8 digit register; one for the sign and one for the magnitude. The term
"digit" will now refer to one of these sign-magnitude positions. The
temptation is to reason as follows: Due to the range restrictions
imposed on I, it may be stored in an 8-bit register. Every I* is
equivalent in value to an I, therefore I* may be stored in an 8 digit
register. This reasoning may well be incorrect as illustrated by the
following specific example.

Let $I = 10000000. = 128$

and let $I^* = 1\bar{1}0000000. = 128$

Although both I and I* are equivalent in value and both are in the
range 0 to 255, I* is in a form requiring 9 digits. This behavior
gives rise to a condition we shall call bogus overflow. The essence
of the problem is the fact that a signed-digit subtracter or adder
will sometimes transform a digit pattern of $01$ into $1\bar{1}$ or a
pattern of $0\bar{1}$ into $\bar{1}1$.

One method of coping with bogus overflow is to provide
auxiliary register positions. It may be shown that if $I_1^*$ and
$I_2^*$ are both represented within n digits or less, the sum or
difference of $I_1^*$ and $I_2^*$ is representable within n+1 digits.
Note however, that when repetitive additions or subtractions are
performed (even addition or subtraction of zero) each operation may
generate another non-zero digit to the left. Once bogus overflow begins
it tends to propagate leftward unless corrected. The implementation of
positions to store the bogus overflow not only adds hardware costs to
the subtracters and registers, it also burdens the assimilation logic.
Although the assimilated number will be contained in only n digits, the
assimilation logic must propagate borrows across n+k digits of the
redundant form of the number. The maximum value of k is the number of
additions or subtractions which take place prior to assimilation. But
fortunately a procedure is available to control bogus overflow. We
shall first state the procedure and then prove that it is valid.

Statement:


Consider the high-order byte of an Illiac III signed-digit
subtracter. The positions are numbered 1 through 8. The radix point is
to the right of position 8. If the inputs to position 1 are such that
$S_1 X_1 = 1$ then a bogus overflow will occur. Without implementing
the 0th position, it may be corrected by complementing the sign of the
result of position 1, i.e. by replacing $T_1$ by $\bar{T}_1$.

Proof: 

The  proof  is  presented  in  the  Appendix. 
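The leftward growth of bogus overflow under repeated subtraction of zero can be reproduced with a toy model. The sketch below works directly on digit values -1, 0, +1 rather than on sign and magnitude bits, assumes NEG = 0 and G = 1, and simply drops the borrow leaving the leftmost position, so the register must be wide enough for the demonstration; the names are illustrative, and the correction scheme itself is not modelled.

```python
def subtract_zero(digits):
    """Pass signed digits (MSB first) through the subtracter with Y = 0.

    With Y = 0 a position emits a borrow only when its digit is -1,
    and its interim digit is the magnitude of the input digit.
    """
    borrow = [1 if d == -1 else 0 for d in digits]
    c_in = borrow[1:] + [0]           # borrow arrives from the right
    return [abs(d) - c for d, c in zip(digits, c_in)]

def value(digits):
    """Algebraic value of the digit string as an integer."""
    v = 0
    for d in digits:
        v = 2 * v + d
    return v

def repeat_subtract_zero(digits, times):
    """Apply subtract_zero the given number of times."""
    for _ in range(times):
        digits = subtract_zero(digits)
    return digits
```

Starting from `[0, 0, 0, 0, 0, -1]`, three passes give `[0, 0, -1, 1, 1, 1]`: the value is still -1, but the leading non-zero digit has moved one position to the left on every pass.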


Truncation  Error 

Let

$$Z^* = \sum_{i=1}^{n} Z_i^* \, 2^{-i}$$

The first column of Table 2 gives the possible digital values of
$Z_i^*$ for the output of a signed-digit subtracter or adder, for the
output of a conventional carry-save adder, and for the output of the
conventional borrow-save subtracter, all of length n to the right of
the radix point.
Let $\tau = 2^{-\ell} - 2^{-n}$.

                             Possible Values      Error $\Delta$ due to truncation to
                             of $Z_i^*$           the right of position $\ell$

Signed-Digit                 $\bar{1}$, 0, 1      $-\tau \le \Delta \le \tau$

Conventional Carry-Save      0, 1, 2              $0 \le \Delta \le 2\tau$

Conventional Borrow-Save     $\bar{1}$, 0, 1      $-\tau \le \Delta \le \tau$


Table 2 - Comparison of Digital Values
and Truncation Error


For  the  signed-digit  subtracter  and  the  conventional  borrow- 
save  subtracter  the  symmetry  of  the  digital  values  gives  rise  to  a 
symmetric  truncation  error.   As  described  in  the  next  section,  this 
property  tends  to   improve  the  precision  of  the  results  of  floating 
point  multiplication. 
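The truncation bounds can be confirmed by brute force for short word lengths. The following sketch enumerates every possible discarded tail for a given digit set; the function name and the choice of exact rational arithmetic are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def truncation_range(digit_set, l, n):
    """Smallest and largest value of the digits discarded when an
    n-digit fraction is truncated to the right of position l."""
    positions = range(l + 1, n + 1)
    tails = [
        sum(Fraction(d, 2**i) for d, i in zip(tail, positions))
        for tail in product(digit_set, repeat=n - l)
    ]
    return min(tails), max(tails)

def tau(l, n):
    """tau = 2**-l - 2**-n."""
    return Fraction(1, 2**l) - Fraction(1, 2**n)
```

For l = 3 and n = 6, `tau(3, 6)` is 7/64; the digit set (-1, 0, 1) gives the symmetric range from -7/64 to 7/64, while the carry-save digit set (0, 1, 2) gives the one-sided range from 0 to 14/64.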

Sign  Detection 

Let Z* be the algebraic value of a number in the signed-digit
format, i.e.

$$Z^* = \sum_{i=1}^{n} Z_i^* \, 2^{-i}$$

where $Z_i^* \in \{\bar{1}, 0, 1\}$.

The sign of Z* is the sign of the highest order, non-zero digit.
Unlike the sign in a non-redundant system, the sign of a number in
signed-digit format is not readily available.

Let

$$Z^* = \sum_{i=1}^{n} (1 - 2S_i)\, X_i \, 2^{-i}$$

where $S_i$ and $X_i \in \{0, 1\}$.

Now treating $S_i$ and $X_i$ as Boolean values, the sign of Z* is given
by the following:

$$\mathrm{SIGNZ} = S_1 X_1 \lor S_2 X_2 \bar{X}_1 \lor S_3 X_3 \bar{X}_1 \bar{X}_2 \lor \cdots \lor S_n X_n \bar{X}_1 \bar{X}_2 \cdots \bar{X}_{n-1}$$

If SIGNZ = 0 then Z* is positive and if SIGNZ = 1 then Z* is
negative. If Z* = 0 then SIGNZ = 0. The implementation of the
equation for SIGNZ becomes very expensive for large n. In Illiac III
sign determination is made only after a result has been assimilated
into a non-redundant form.
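The SIGNZ expression amounts to finding the sign of the leading non-zero digit. A minimal Python sketch, with hypothetical function names and digits listed from position 1 to position n, evaluates it term by term and compares it with the sign of the algebraic value.

```python
from fractions import Fraction
from itertools import product

def sign_z(S, X):
    """SIGNZ: OR over i of S_i X_i AND (all higher magnitudes are 0)."""
    higher_all_zero = 1
    sign = 0
    for s, x in zip(S, X):
        sign |= s & x & higher_all_zero
        higher_all_zero &= 1 - x
    return sign

def sd_fraction(S, X):
    """Z* = sum of (1 - 2 S_i) X_i 2**-i for i = 1 .. n."""
    return sum(Fraction((1 - 2 * s) * x, 2**i)
               for i, (s, x) in enumerate(zip(S, X), start=1))

def check(n):
    """True if SIGNZ = 1 exactly when Z* < 0, for all n-digit inputs."""
    for S, X in product(product((0, 1), repeat=n), repeat=2):
        if sign_z(S, X) != int(sd_fraction(S, X) < 0):
            return False
    return True
```

The loop makes the cost visible: the term for position i depends on every magnitude bit to its left, which is why a combinational implementation grows expensive for large n.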

Assimilation 

Although arithmetic operations are computed in the
redundant signed-digit format they are eventually converted into
a conventional form, i.e., the sign bits are assimilated. A
negative number will be represented in two's complement. The
requirement is that the redundantly represented number,

$$Z^* = Z_0^* + \sum_{i=1}^{n} Z_i^* \, 2^{-i}$$

with $Z_i^* \in \{\bar{1}, 0, 1\}$, be converted to a conventional
notation of the form

$$A = -2A_{-1} + A_0 + \sum_{i=1}^{n} A_i \, 2^{-i}$$

with $A_i \in \{0, 1\}$, such that A = Z*. $A_{-1}$ is the sign bit.
This conversion requires a borrow propagation followed by an exclusive
OR operation.

Let $Z_i$ and $T_i$ be the magnitude and sign bit, respectively,
of the digit $Z_i^*$. The propagation logic produces borrow bits,
$B_i$, defined as follows:

$$B_{i-1} = B_i \bar{Z}_i \lor T_i Z_i$$

where i = n, n-1, ..., 0; $B_n = 0$. The assimilated result, A, is
produced by evaluating

$$A_i = Z_i \oplus B_i$$

for i = 0, 1, ..., n. $A_{-1}$, the sign, equals $B_{-1}$. Note that
the recursive definition of $B_i$ is essentially an evaluation of the
following:

(Borrow from i-1 position) = (Borrow from i position)($Z_i^* = 0$)
                             ∨ ($Z_i^* = \bar{1}$)

In actual practice a signed-digit subtracter may be
used to perform the second step of assimilation: the formation
of $Z_i \oplus B_i$. Recall from Figure 1 that the magnitude output of
a position is

$$Z_i = C_i \oplus X_i \oplus Y_i.$$

If the magnitude bit of the redundant result is applied to the $X_i$
input, $B_i$ is applied to the $Y_i$ input, and $C_i = 0$, then the
output $Z_i$ equals $A_i$. Since
$C_i = (S_{i+1} X_{i+1} \lor \bar{X}_{i+1} Y_{i+1})\, G$, $C_i$ may be
forced to 0 by setting G=0.

In Illiac III the formation of the $B_i$ bits has been
accelerated by use of lookahead techniques. The $B_i$ bits for i
equal 1 to 64 are formed in 10 collector delays.
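The two assimilation steps, borrow propagation followed by an exclusive OR, can be sketched serially in software. The version below ripples the borrow from position n to position 0 and therefore does not model the lookahead used in the hardware; the function names and list layout (index 0 holds digit 0) are assumptions made for illustration.

```python
from fractions import Fraction

def assimilate(T, Z):
    """Convert digits Z*_0 .. Z*_n (sign bits T, magnitude bits Z)
    to two's complement. Returns (A_minus1, [A_0, ..., A_n])."""
    n = len(Z) - 1
    A = [0] * (n + 1)
    B = 0                                   # B_n = 0
    for i in range(n, -1, -1):
        A[i] = Z[i] ^ B                     # A_i = Z_i xor B_i
        B = (B & (1 - Z[i])) | (T[i] & Z[i])  # B_{i-1}
    return B, A                             # A_{-1} = B_{-1}

def signed_digit_value(T, Z):
    """Z* = sum of (1 - 2 T_i) Z_i 2**-i for i = 0 .. n."""
    return sum(Fraction((1 - 2 * t) * z, 2**i)
               for i, (t, z) in enumerate(zip(T, Z)))

def conventional_value(a_sign, A):
    """A = -2 A_{-1} + sum of A_i 2**-i for i = 0 .. n."""
    return -2 * a_sign + sum(Fraction(a, 2**i) for i, a in enumerate(A))
```

For the digits 0, +1, -1 (value 1/4) the call `assimilate([0, 0, 1], [0, 1, 1])` returns `(0, [0, 0, 1])`. For the digits 0, -1, 0 (value -1/2) the call `assimilate([0, 1, 0], [0, 1, 0])` returns `(1, [1, 1, 0])`, i.e. -2 + 1 + 1/2.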


Implementation

Figure 2 illustrates the logic of one position of the
signed-digit subtracter. The logic symbols conform to MIL-STD-806B.
The AND gates are implemented with diodes; the NOR gates are DTL.
The operating sequence for two adjacent positions, i and i+1, is
as follows:

1. $COUT_{i+1}$ and its complement are formed from $S_{i+1}$,
$X_{i+1}$, $Y_{i+1}$ and G in one collector delay. Note that
$COUT_{i+1} = CIN_i$.

2. $T_i$ and $Z_i$ are formed in one additional collector delay.

3. The complements of $T_i$ and $Z_i$ are formed in one
collector delay. The complements are necessary as inputs to the
next subtracter in the cascade.


Using this logic, parallel addition or subtraction takes
place in three collector delays. A block diagram of the entire
adder-subtracter complex is shown in the section describing
multiplication. It consists of a cascade of four subtracters,
each 64 positions wide.
MULTIPLICATION


Background 

Multiplication in a digital arithmetic unit is generally
accomplished by over-and-over addition of multiples of the
multiplicand with the contents of an accumulator. One way to accelerate
the execution of multiplication is to decrease the time required to
add the multiplicand to the partial product. The efficacy of a
reduced add time is the primary motivation for the use of a
borrow-save device such as the signed-digit subtracter. Another
technique for accelerating the execution of multiplication is to
accommodate more than one bit of the multiplier per iteration.
Such a scheme may be viewed as multiplication in radix r, where
$r = 2^k$, with k equal to the number of bits inspected per iteration.

While use of a higher radix has the advantage of reducing
the number of iterations by a factor of k over the binary case,
it has the disadvantage of requiring additional multiples of the
multiplicand. For a non-redundant number system, multiplication
radix r requires the multiples 0, 1, 2, ..., (r-1) times the
multiplicand. If, however, a redundant number system is adopted
then the multiples 0, 1, 2, ..., (r-1) may be transformed into the
multiples -r/2, (-r/2 + 1), ..., 0, 1, ..., r/2 (for even radices).
In this new set of multiples, half of the members are merely the
complement of the others. For the specific case r = 4, the set
{0, 1, 2, 3} may be replaced by the set
{$\bar{2}$, $\bar{1}$, 0, 1, 2}. Note that in fact we do have
redundancy in the second set, since there are more than r (in this case
five) digit symbols. The multiple of 3 in the first set is awkward or
costly to form, but in the second set all multiples may be formed by
shifting and complementation.

It is useful to view this transformation as a recoding
of groups of k bits of the multiplier represented in conventional
form into digits belonging to the redundant set in such a manner
that algebraic equivalence is maintained. Additional information
on the theory of multiplier recoding may be found in references
[4] and [5]. Parts of these works are concerned with recodings
which permit the probability of a 0 digit to be high. This
property is important in an implementation in which an adder
is bypassed if a multiple of 0 is selected. In Illiac III,
however, this property is not stressed since the addition time
is at least as fast as a bypass.

Recoding  Scheme 

The recoding scheme adopted for Illiac III was suggested
by Wallace [6]. It is first defined for radix 4 but will be
extended to radix 256. The recoding actually requires the
parallel inspection of three bits of the multiplier. If $X_i$ is
the low-order bit of the multiplier, then the bits inspected are
$X_{i-1}$, $X_i$, and $X_{i+1}$. The bit $X_{i+1}$ is an extra
position at the right of the least significant bit of the multiplier.
It is initially 0, but after the first right shift of the multiplier it
will equal the previous $X_{i-1}$, which may not be 0. In a sense,
$X_{i+1}$ is the indicator of what "mistake" was made on the previous
cycle. The recoding is shown in Table 3. It will accommodate
a negative number in two's complement representation.


$X_{i-1}$   $X_i$   $X_{i+1}$   Recoded Digit / Multiple Selected

    0         1         1                  +2
    0         1         0                  +1
    0         0         1                  +1
    0         0         0                   0
    1         1         1                   0
    1         1         0                  -1
    1         0         1                  -1
    1         0         0                  -2

TABLE 3 - Multiplier Recoding Scheme
The Wallace recoding scheme has the following
advantages:

1. It requires little logic.

2. All selections can be made simultaneously; the
recoding is not a serial process.

3. The multiples used can be obtained from the
multiplicand by the processes of complementation and displacement.

4. It applies without alteration to the leftmost
digits of the multiplier.
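The radix 4 recoding can be illustrated in software. In the sketch below each recoded digit is computed as -2 X_{i-1} + X_i + X_{i+1}, which reproduces the eight cases of the recoding table; the function names and the even word length are assumptions made here, and the loop is serial only for readability, since each digit depends solely on three multiplier bits.

```python
def recode_radix4(multiplier, n_bits):
    """Recode an n_bits-wide two's complement multiplier (n_bits even)
    into radix 4 digits in {-2, -1, 0, 1, 2}, least significant first."""
    # Python's >> is an arithmetic shift, so negative values work too.
    bits = [(multiplier >> j) & 1 for j in range(n_bits)]
    digits = []
    x_right = 0                      # X_{i+1}: extra position, initially 0
    for j in range(0, n_bits, 2):
        x_i, x_left = bits[j], bits[j + 1]   # X_i and X_{i-1}
        digits.append(-2 * x_left + x_i + x_right)
        x_right = x_left             # becomes X_{i+1} after the shift
    return digits

def radix4_value(digits):
    """Value represented by radix 4 digits, least significant first."""
    return sum(d * 4**k for k, d in enumerate(digits))
```

For example, `recode_radix4(7, 8)` returns `[-1, 2, 0, 0]`, i.e. 7 = -1 + 2·4, so the multiple of 3 is never required. Because every selected multiple lies in {-2, -1, 0, 1, 2}, it can be formed from the multiplicand by a shift and, where needed, a complementation.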

Multiplication has been further accelerated by cascading
four signed-digit subtracters between the primary and secondary
ranks of the accumulator. A radix 4 multiplication takes place
at each subtracter; the result is a radix 256 multiplication for
a complete pass. Eight bits of the multiplier are retired per
iteration. The motivation for cascading subtracters is demonstrated
by the following:


Let

$t_m$ = the time required to execute the iterative part
        of the multiplication,

$t_a$ = the time required to add or subtract,

$t_s$ = the summation of the following times: time
        to load the secondary accumulator,
        propagation time through the shift gates into
        the primary accumulator,
        time to load the primary accumulator,
        propagation time through the gates on the output
        of the primary accumulator, control overhead time,

$n_a$ = the number of additions,

$n_s$ = the number of shifts.

Thus

$t_m = n_a t_a + n_s t_s$


If $N$ is the number of bits in the multiplier, $r'$ is the
radix of the multiplication performed at each subtracter and $K$ is
the number of subtracters in cascade, then

$n_a = \frac{N}{\log_2 r'}$

$n_s = \frac{n_a}{K}$

$t_m = \frac{N}{\log_2 r'} \left( t_a + \frac{t_s}{K} \right)$


The radix of the multiplication from accumulator to
accumulator is given by $r = 2^{K \log_2 r'}$. For the Illiac III
implementation, $t_a = 3$ delays, $t_s = 8$ delays, $N = 56$, and
$r' = 4$. The table below gives $t_m$ for $K = 1$ to 6.

K      $t_m$ (collector delays)      Percent Decrease

1               308                        0
2               196                       36
3               158                       49
4               140                       55
5               129                       58
6               121                       61
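
The table can be reproduced from the timing expression with a short sketch. The function and parameter names are hypothetical; the constants are those quoted above. The expression yields fractional values for K = 3, 5 and 6, which agree with the whole-number table entries to within one collector delay.

```python
import math


def multiply_time(n_bits, stage_radix, cascade, t_add, t_shift):
    """Iterative multiplication time t_m = n_a * t_a + n_s * t_s,
    with n_a = N / log2(r') and n_s = n_a / K."""
    n_add = n_bits / math.log2(stage_radix)
    n_shift = n_add / cascade
    return n_add * t_add + n_shift * t_shift


if __name__ == "__main__":
    base = multiply_time(56, 4, 1, 3, 8)
    for k in range(1, 7):
        t_m = multiply_time(56, 4, k, 3, 8)
        decrease = 100 * (1 - t_m / base)
        print(f"K={k}  t_m={t_m:6.1f}  decrease={decrease:4.1f}%")
```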

Increasing the number of subtracters decreases $t_m$, but
by a decreasing amount. The 36% decrease in $t_m$ for doubling the
number of subtracters is substantial. The 55% decrease for
quadrupling  the  subtracters  is  less  impressive  but  was  nevertheless 
deemed  justifiable  in  light  of  the  anticipated  high  demand  for 
multiplications.   The  following  factors  also  contributed  to  this 
decision: 

1.  A  radix  256  structure  is  highly  compatible  with  byte 
oriented  data  formats. 

2.  Control complexity and overhead are decreased.

3.  The structure can be used to accelerate division
and thus the cost is amortized across both operations.

Multiplication Structure

Figure  3  is  a  block  diagram  of  an  Illiac  III  arithmetic 
unit.   The  conventions  used  in  this  figure  are  as  follows: 

1)  Functional  sub-blocks  are  denoted  by  rectangles. 
Inside  each  box  is  the  name  of  the  block  followed  by  a 
list  of  the  names  of  signals  which  control  it. 

2)  The  lines  between  boxes  denote  data  buses. 

3)  Selector  signal  names  are  of  the  form  F  X  T,  where  F 
is  the  name  of  the  register  from  which  the  data  is 
transferred,  and  T  is  the  name  of  the  register  to  which 
the  data  is  transferred. 

X = D if the transfer is direct, i.e. without shifting.
X  =  Rn  if  the  data  is  shifted  n  places  to  the  right 

during  the  transfer. 
X  =  Ln  if  the  data  is  shifted  n  places  to  the  left 

during  the  transfer. 
4)  A register name standing alone, for example, UQ, denotes
the true output of all positions of the register. A
subsection of a register is specified in the following form:
<register name> np,

where n is the number of the first byte (8 bits per byte)
of the subsection and p is the number of the last byte of
the subsection. Byte numbering is 0 through 7. Example:
VDUH47 means V-BUS Direct to UH-Register, bytes 4 through 7.

5)  If  R  denotes  the  name  of  a  register,  then  RSEL  denotes 
the  output  of  the  associated  input  selector. 

6)  If  R  denotes  the  name  of  a  register,  then  LDR  denotes 
the  signal  which  loads  the  output  of  the  associated 
selector  into  the  register  flip-flops. 

7)  All selectors, registers, subtracters and shift gates
are 64 bits (8 bytes) wide, except for the M-Register
which is 56 bits wide.

8)  The signed digit subtracters are denoted SDS1 through
SDS4.

One multiplication cycle consists of a sequence of four
radix 4 multiplications; that is, multiplication is performed radix 256.
Nine bits of the multiplier stored in the UQ Register are recoded
simultaneously to control the gates of the M Shift Array and the
NEG signals of the signed-digit subtracters which determine whether
addition or subtraction is performed.

The shifters are all logically identical; however,
they are connected to the appropriate subtracter so that, with
respect to the radix point of the subtracters (between the first
and second byte), the values of the multiples are as shown below:


SDS No.        Multiples Selected

   1           0, +128, +64
   2           0, +32, +16
   3           0, +8, +4
   4           0, +2, +1


The recoding is performed in three-bit, overlapping
groups according to the specifications in Table 3. Figure 4
illustrates the low-order byte plus the extra right-most bit of the
UQ register and the shift gates each controls.


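UQ BIT NO.      57   58   59   60   61   62   63   64   65

Bits 57, 58, 59 produce signals:   ML7Y1, ML6Y1
Bits 59, 60, 61 produce signals:   ML5Y2, ML4Y2
Bits 61, 62, 63 produce signals:   ML3Y3, ML2Y3
Bits 63, 64, 65 produce signals:   ML1Y4, MDY4


FIGURE 4.  MULTIPLIER BIT, SHIFT GATE CORRESPONDENCE.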
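
The logic equations actually implemented in the Multiplier
Recode box are shown below. The MYNEG (Multiply Negation) signals
are used to set the NEG controls of the SDS to select whether the
multiple is added or subtracted.


$ML7Y1 = (\overline{UQ_{57}}\, UQ_{58}\, UQ_{59}) \lor (UQ_{57}\, \overline{UQ_{58}}\, \overline{UQ_{59}})$

$ML6Y1 = UQ_{58} \oplus UQ_{59}$

$ML5Y2 = (\overline{UQ_{59}}\, UQ_{60}\, UQ_{61}) \lor (UQ_{59}\, \overline{UQ_{60}}\, \overline{UQ_{61}})$

$ML4Y2 = UQ_{60} \oplus UQ_{61}$

$ML3Y3 = (\overline{UQ_{61}}\, UQ_{62}\, UQ_{63}) \lor (UQ_{61}\, \overline{UQ_{62}}\, \overline{UQ_{63}})$

$ML2Y3 = UQ_{62} \oplus UQ_{63}$

$ML1Y4 = (\overline{UQ_{63}}\, UQ_{64}\, UQ_{65}) \lor (UQ_{63}\, \overline{UQ_{64}}\, \overline{UQ_{65}})$

$MDY4 = UQ_{64} \oplus UQ_{65}$


$NEG0 = UQ_{57}$

$NEG1 = UQ_{57} \oplus UQ_{59}$

$NEG2 = UQ_{59} \oplus UQ_{61}$

$NEG3 = UQ_{61} \oplus UQ_{63}$

$NEG4 = UQ_{63}$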
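
The shift gate equations may be checked against the recoding table with a brief sketch. The function names and the dictionary representation of the UQ bits are assumptions made for illustration. Only the shift gate selections, which fix the magnitude of each multiple, are modelled; the NEG signals are left out.

```python
def shift_gates(uq):
    """Evaluate the eight shift gate signals.

    uq maps a UQ bit number (57 through 65) to 0 or 1.
    """
    def select_two(a, b, c):
        # Multiple of magnitude 2: bit pattern 011 or 100.
        return ((1 - uq[a]) & uq[b] & uq[c]) | (uq[a] & (1 - uq[b]) & (1 - uq[c]))

    def select_one(b, c):
        # Multiple of magnitude 1: the two low-order bits of the group differ.
        return uq[b] ^ uq[c]

    return {
        "ML7Y1": select_two(57, 58, 59), "ML6Y1": select_one(58, 59),
        "ML5Y2": select_two(59, 60, 61), "ML4Y2": select_one(60, 61),
        "ML3Y3": select_two(61, 62, 63), "ML2Y3": select_one(62, 63),
        "ML1Y4": select_two(63, 64, 65), "MDY4": select_one(64, 65),
    }


def selected_magnitudes(uq):
    """Radix-4 digit magnitude (0, 1 or 2) chosen at each of the four stages."""
    g = shift_gates(uq)
    return [
        2 * g["ML7Y1"] + g["ML6Y1"],
        2 * g["ML5Y2"] + g["ML4Y2"],
        2 * g["ML3Y3"] + g["ML2Y3"],
        2 * g["ML1Y4"] + g["MDY4"],
    ]


if __name__ == "__main__":
    bits = [0, 1, 1, 0, 1, 0, 0, 1, 0]  # UQ57 through UQ65
    print(selected_magnitudes(dict(zip(range(57, 66), bits))))
```

At most one gate of each pair is active for any bit pattern, so each subtracter receives a single multiple or zero.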


Brief  Operational  Description 

The  fractional  part  of  the  multiplier  is  loaded  into 
the  UQ-Register  from  the  V-BUS.   The  fractional  part  of  the 
multiplicand  is  loaded  into  the  UH-Register  from  the  V-BUS  and 
then  forwarded  to  the  M-Register.   Both  fractions  are  7  bytes 
(56 bits) long. The low-order byte of the UQ-Register plus an
additional position, $UQ_{65}$ (initially 0), drives the multiplier
recoder. One multiplication loop consists of the following
sequence  of  steps: 

1)  Recoder sets up shift gates and NEG signals. Contents
of  US-UM  (accumulated  result  in  signed-digit  format) 
gated  into  subtracter  cascade. 

2)  Output  of  subtracter  cascade  loaded  into  secondary 
rank  of  accumulator,  LS-LM. 

3)  Multiplier shifted right 8 bits. Secondary rank of
accumulator (LS-LM) shifted right 8 bits into
primary rank (US-UM).

This  loop  is  executed  seven  times,  once  for  each  byte  of 
the multiplier. At the end of seven loops, $UQ_{65}$ may be 1. If so,
then  1  times  the  multiplicand  must  be  added  to  the  partial  product. 
This  is  accomplished  during  the  assimilation  pass,  the  steps  of 
which  are  as  follows: 

1)  Turn off subtracters 2, 3, 4 by setting G2 = G3 = G4 = 0.
Set PDY4 (Propagation Logic Direct to Y4 input on
subtracter 4). Set MDY1 if $UQ_{65}$ = 1. Set NEG0 =
NEG1 = 1; other NEG signals set to 0.

2)  Gate US-UM into subtracter cascade. The T and Z
outputs of signed-digit subtracter 1 (SDS1) drive
the propagation logic. Meanwhile the Z bits from
SDS1 propagate through SDS2 and SDS3. In SDS4 the
output of the propagation logic and the Z bits are
combined in an exclusive OR to produce the result in
a conventional form. This assimilated result is
stored in the LM-Register and then forwarded to the
UQ-Register. The UQ-Register serves as an input-
output buffer.
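
The arithmetic of the seven loops and the final correction can be sketched with ordinary integers. This is an illustration only: it does not model the signed-digit accumulator, the right shifts, or truncation, and it assumes that each recoded digit equals -2 X_{i-1} + X_i + X_{i+1} as in Table 3. The names are hypothetical.

```python
def recode_byte(byte, extra):
    """Recode 8 multiplier bits plus the extra right-hand bit
    into four radix-4 digits, most significant first."""
    bits = [(byte >> (7 - k)) & 1 for k in range(8)] + [extra]
    return [-2 * bits[k] + bits[k + 1] + bits[k + 2] for k in (0, 2, 4, 6)]


def multiply(multiplier, multiplicand, n_bytes=7):
    """Multiply by retiring one multiplier byte (one radix-256 digit) per loop.

    multiplier must be a non-negative integer below 256 ** n_bytes.
    """
    product = 0
    extra = 0  # plays the role of the extra position, initially 0
    for loop in range(n_bytes):
        byte = (multiplier >> (8 * loop)) & 0xFF
        d1, d2, d3, d4 = recode_byte(byte, extra)
        digit = 64 * d1 + 16 * d2 + 4 * d3 + d4  # signed radix-256 digit
        product += digit * multiplicand * 256 ** loop
        extra = byte >> 7  # high-order bit of the byte just retired
    if extra:
        # Final correction: add 1 times the multiplicand at the next position.
        product += multiplicand * 256 ** n_bytes
    return product


if __name__ == "__main__":
    a, b = 0x80FF00123456AB, 0x1234567890ABCD
    print(multiply(a, b) == a * b)
```

Each radix-256 digit equals the byte value, plus the carried-in extra bit, minus 256 times the high-order bit of the byte. The deficit left by a high-order 1 is repaid in the following loop, or in the final correction after the last loop.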

The range of a normalized, non-zero fraction, $f$, is given by
$1/16 \le f < 1$. The product of two such fractions, $f_1$ and $f_2$,
therefore lies in the range $1/256 \le f_1 f_2 < 1$. A product may
require a terminal left shift of 4 bits accompanied by a reduction of
its exponent. If zeros were inserted in the low-order 4 bits during
the shift then the precision of the result would be impaired. The
value of these bits, although actually computed, would normally be
lost in the last right shift from the LS-LM Registers to the US-UM
Registers. Logic has been added which assimilates and stores the four
digits before they are lost by shifting. The borrow from the
assimilation is coupled into position 64 of the Propagation Logic.
If a terminal shift is required these four bits, rather than zeros,
are shifted into the low-order position.

Truncation  Error 

It  is  difficult  to  identify  a  normalized  result  while  it 
is represented redundantly. For this reason the 4 extra low-order
bits  of  the  product  are  always  assimilated  but  used  only  if  the 
full-precision  result  requires  a  left  shift  for  normalization. 
For  purposes  of  error  analysis  we  may  assume  that  60  rather  than 
56  bits  of  the  product  are  assimilated.   The  range  of  the  trunca- 
tion error,  e  ,  due  to  truncation  after  60  signed-digits  is  given 
by  -2   <e   <2    .   In  cases  in  which  no  left  shift  is  required,  the 
four  low-order  bits  of  the  assimilated  result  are  dropped  and  thus 
e   is  in  the  range  -2   <  e   <  2  '  .   If  the  left  shift  is  required 

t/-  q/T 

then  e  is  in  the  range  -2  '   <  e   <  2  '  .  This  latter  range  is
worst  case,  but  since  in  general  the  programmer  will  not  know 
whether  or  not  the  shift  has  occurred,  it  must  be  taken  as  the 
best  guaranteed  precision. 

The  entire  arithmetic  unit  has  been  simulated  using  PL/1. 
The  simulation  of  multiplication  brought  to  light  an  interesting 
property  of  the  signed-digit  representation:   the  tendency  to 
produce rounded results. The results of the simulator were
compared with the result for the same operation performed by the
arithmetic unit of the IBM/360/75. Frequently the result produced
by the simulator was greater than that of the 360 by $2^{-56}$: a 1 in
the least significant position.

It  was  determined  that  the  IBM/360  was  producing  the 
result  by  what  is  equivalent  to  truncating  a  double  precision 
result in conventional form. No rounding was performed. In the
Illiac  III  multiplication  scheme,  since  the  signed-digits  may 
be  positive  or  negative,  either  a  positive  or  a  negative  truncation 
error  is  introduced  by  each  right  shift  in  the  multiplication  loop. 
In the cases observed these errors tended to cancel. The result
produced is the same as would be produced by the IBM/360 if rounding
occurred at the position of weight $2^{-56}$ based upon the value of
the bit to the right. Subsequent work by Robertson [7], based
upon work by Rohatsh [2], has shown that for the signed-digit
format, the probability of obtaining a rounded result is 5/6.


DIVISION

Background. 

Robertson [8] has proposed a class of division techniques
in which quotient digits are selected based upon an inspection of only
a few high-order bits of the divisor and partial remainder. The quotient
selection mechanism may be viewed as a model of the full precision
division mechanism. The model division uses truncated versions of the
divisor and partial remainders to produce quotient digits which are in
turn used in forming the next full precision partial remainder. The
division procedure used in the model need bear no relationship to a
conventional division procedure, in particular, to the full precision
procedure. The procedure for the model division of Illiac III is a
radix 4 table look-up. The nature of this class of division techniques
is explored in detail in a paper by Atkins [9].

The  model  division  determines  which  multiples  of  the  divisor 
are  to  be  subtracted  from  the  partial  remainder.  In  this  respect  it  is 
analogous  to  the  multiplier  recoder  and  may,  in  fact,  be  viewed  as  a 
quotient  recoder.  In  multiplication  the  recoder  introduces  redundancy  into 
the  representation  of  the  multiplier;  in  division  the  recoder  introduces 
redundancy  into  the  representation  of  the  quotient.  The  quotient  recoder 
is, however, complicated by the following properties of the division
algorithm: 

1.  The  quotient  recoding  is  a  function  of  both  the  divisor 
and  the  partial  remainder. 

2.  The  partial  remainder,  unlike  the  divisor  or  the 
multiplier,  is  not  constant  throughout  the  operation. 

3.  Since  partial  remainders  are  formed  with  the  signed- 
digit  subtracters,  they  are  represented  redundantly. 

But  despite  these  complications,  the  strong  analogy  between  multiplier 
recoding  and  the  concept  of  the  model  division  leads  to  a  division 
scheme which is highly compatible with the multiplication scheme
described in the previous section.

Model Division

As shown in the Atkins paper [9], a radix 4 division may be
performed using a table look-up on inputs consisting of the four
high-order bits of the divisor and the six high-order bits of the shifted
partial remainder. The output of the table is a quotient digit value
of either -2, -1, 0, 1, or 2. In the most brute force form the table
look-up may be thought of as a grid or matrix. The vertical lines are
outputs of decoders applied to d, the truncated (4 bit) version of the
divisor; the horizontal lines are outputs of decoders applied to rp, the
truncated (6 bit) version of the shifted partial remainder. At each
intersection of the lines is an AND gate with one input connected to the
vertical line and the other connected to the horizontal line. Each point
of intersection corresponds to a quotient digit value, i, and thus the
output of each AND gate is connected to an input of the OR gate with
output corresponding to the quotient digit, q = i.

The size of the table is constrained by restrictions on
the range of the divisors and partial remainders. The divisor is
normalized in the range $1/2 \le d < 1$. Due to certain properties of
the division scheme (see Ref. [9]), any partial remainder, say $p_j$,
must be in the range $|p_j| \le \frac{2}{3} d$. The shifted partial
remainder, $r p_j$, where $r$ is the radix and $j$ is the recursive
index, must be in the range $|r p_j| \le \frac{8}{3} d$ when $r = 4$.
The divisor is always positive; the partial remainder may be either
positive or negative since the division is nonrestoring. The actual
implementation is not nearly as formidable as the brute force attack
might imply. This will be demonstrated using the actual equations for
the Illiac III model division.
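
The range restrictions can be exercised with a minimal sketch of a radix 4 division using the digit set -2 through 2. The digit selection here is an exact comparison on full-precision rationals, an assumption made purely for illustration; it is not the truncated table look-up of the Illiac III model division. The names are hypothetical.

```python
from fractions import Fraction


def select_digit(shifted, d):
    """Choose q in {-2, ..., 2} so that |shifted - q*d| <= (2/3) d.

    Exact selection by rounding to the nearest integer; the hardware
    instead looks up truncated estimates in a table.
    """
    q = round(shifted / d)
    return max(-2, min(2, q))


def divide_radix4(dividend, d, steps):
    """Return the quotient digits and the final partial remainder."""
    p = Fraction(dividend)
    d = Fraction(d)
    assert Fraction(1, 2) <= d < 1
    assert abs(p) <= Fraction(2, 3) * d
    digits = []
    for _ in range(steps):
        shifted = 4 * p
        assert abs(shifted) <= Fraction(8, 3) * d  # range of r * p_j
        q = select_digit(shifted, d)
        p = shifted - q * d
        assert abs(p) <= Fraction(2, 3) * d        # range of p_{j+1}
        digits.append(q)
    return digits, p


if __name__ == "__main__":
    digits, remainder = divide_radix4(Fraction(3, 10), Fraction(5, 8), 8)
    quotient = sum(Fraction(q, 4 ** (j + 1)) for j, q in enumerate(digits))
    print(digits, float(quotient), float(Fraction(3, 10) / Fraction(5, 8)))
```

The bound of 2/3 d on the partial remainder is what allows a digit magnitude of at most 2 to suffice at every step: a shifted remainder of up to 8/3 d, less 2d, is again at most 2/3 d.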

It is prohibitively expensive to apply the redundant form
of the partial remainder directly to a table look-up. In a redundant
number system one algebraic value may be disguised in many forms. The
6 digit estimate of the partial remainder is therefore assimilated into
a conventional radix complement form prior to the table look-up. The
assimilated version is of the form $A_0 A_1 A_2 . A_3 A_4 A_5 A_6$,
where $A_0$ is the sign.

The estimates of the divisor are decoded into the intervals shown in
Table 4. Since the divisor is stored in the M-Register and since the
radix point is between positions 8 and 9, the high-order four bits of
the divisor are designated $M_9$, $M_{10}$, $M_{11}$ and $M_{12}$.
Note that since $d$ is at least 1/2, $M_9$ is always 1.

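
Interval
Name     Logic Equation                                                 Range of divisor, d, represented

D1       $\overline{M_{10}}\,\overline{M_{11}}\,\overline{M_{12}}$      $1/2 \le d < 9/16$
D2       $\overline{M_{10}}\,\overline{M_{11}}\,M_{12}$                 $9/16 \le d < 5/8$
D3       $\overline{M_{10}}\,M_{11}\,\overline{M_{12}}$                 $5/8 \le d < 11/16$
D4       $\overline{M_{10}}\,M_{11}\,M_{12}$                            $11/16 \le d < 3/4$
D5       $M_{10}\,\overline{M_{11}}$                                    $3/4 \le d < 7/8$
D6       $M_{10}\,M_{11}$                                               $7/8 \le d < 1$
D7       $D_1 \lor D_2 \lor D_3$                                        $1/2 \le d < 11/16$
D8       $D_4 \lor D_5 \lor D_6$                                        $11/16 \le d < 1$
D9       $D_4 \lor D_5$                                                 $11/16 \le d < 7/8$
D10      $(D_5 \lor D_6) = M_{10}$                                      $3/4 \le d < 1$
D11      $(D_1 \lor D_2 \lor D_3 \lor D_4) = \overline{M_{10}}$         $1/2 \le d < 3/4$


Table 4 - Divisor Interval Selection Logic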
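
The interval logic of Table 4 can be sketched as follows. The function names are hypothetical, and the equations are taken as the table is read here, with $M_9$ fixed at 1. The helper truncated_divisor gives the lower end of the range represented by a given bit pattern.

```python
from fractions import Fraction


def divisor_intervals(m10, m11, m12):
    """Decode divisor bits M10, M11, M12 (each 0 or 1) into D1 through D11."""
    n10, n11, n12 = 1 - m10, 1 - m11, 1 - m12
    d = {
        "D1": n10 & n11 & n12,
        "D2": n10 & n11 & m12,
        "D3": n10 & m11 & n12,
        "D4": n10 & m11 & m12,
        "D5": m10 & n11,
        "D6": m10 & m11,
    }
    d["D7"] = d["D1"] | d["D2"] | d["D3"]
    d["D8"] = d["D4"] | d["D5"] | d["D6"]
    d["D9"] = d["D4"] | d["D5"]
    d["D10"] = d["D5"] | d["D6"]
    d["D11"] = d["D1"] | d["D2"] | d["D3"] | d["D4"]
    return d


def truncated_divisor(m10, m11, m12):
    """Value of the 4-bit divisor estimate 0.1 M10 M11 M12."""
    return Fraction(1, 2) + Fraction(m10, 4) + Fraction(m11, 8) + Fraction(m12, 16)


if __name__ == "__main__":
    for pattern in range(8):
        bits = ((pattern >> 2) & 1, (pattern >> 1) & 1, pattern & 1)
        active = [name for name, v in divisor_intervals(*bits).items() if v]
        print(bits, truncated_divisor(*bits), active)
```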

The assimilated estimate of the partial remainder and the
outputs of the divisor interval selection logic are used to generate
the logic signals ZERO, ONE, and TWO corresponding to quotient digit
magnitudes of 0, 1, and 2, respectively. The signals ZERO and TWO are
formed as the OR of two other signals, one corresponding to the quadrant
for positive partial remainders; the other corresponding to negative
partial remainders. The signal, ONE, is implemented in the form
$ONE = \overline{ZERO \lor TWO}$. The following defines ZERO and TWO:

ZERO  =  ZEROP  v  ZERON 
TWO  =  TWOP  v  TWON 


ZEROP  =  A  A  A  A  A,  v  A  A  A  A  A,  AD  v  AJVn  AJYJ 


0  1  2  3  10 


ZERON  =  AqA^A  A^  v  AqA^A  A^A  D1Q 


TWOP  =  A0A3AuA5A6Dl  v  A^A^  v  A^A^A^ 
v  AQA2A3D8  v  A0A2A3A1+D9  v  A^A^A^ 
v  I0A2I3AUA5A6D6  v  AoAl  v  A^ 

TWON  =  AQA3AUD1  v  A^A^D,,  v  A^A^A^ 

v  A0A2A3AUA5A6D5  v  A^A^Dg  v  A^ 
v  A0A2A3AUD5  v  A0A2A3  v  A^ 

Operational  Description  of  Model  Division 

We have now defined a radix 4 quotient selection mechanism
which is analogous to the multiplier recoder defined in Table 3. As with
multiplication, division is extended to radix 256 by means of four
successive applications of radix 4 division.

Let  the  figure  below  represent  the  high-order  byte  of  the 
US-UM  Register  and  let  :  denote  the  radix  point  for  the  full  precision 
division. 


(Figure: the high-order byte of US-UM, with its leading positions
marked "To Radix 4 Division".)

A radix 4 division with radix point denoted '.' is applied to the
leading 6 positions of the output of US-UM. For a radix 256 division of
the class described in [9] the magnitude of the shifted partial remainder
is less than 170 2/3 relative to the radix point ':'. The shifted
partial remainder relative to '.' is therefore less than 8/3. The radix
4 table look-up selects a quotient digit magnitude of either 0, 1, or 2.
These correspond to radix 256 digits of 0, 64, or 128 and to the selection
of no shift gates, shift gate ML6Y1, or shift gate ML7Y1. If neither gate
is selected the Y input to the subtracter is zero. The selected multiple
of the divisor is added in the first signed-digit subtracter (SDS1 in
Figure 3) if the sign of the partial remainder, $A_0$, is 1; it is
subtracted if $A_0$ is 0. The new partial remainder, the next input to
the model division, appears at the output of SDS1. For the next radix 4
division, rather than shifting the partial remainder left two positions,
the input to the model is shifted right by two positions. Figure 5
summarizes all four stages of one pass through the subtracter cascade.

Figure  6  is  a  block  diagram  of  the  entire  model  division 
structure.  The  Input  Gating  is  an  AND-OR  complex  which  under  control  of 
signals  C  through  C,  gates  the  appropriate  digits  of  successive 
partial  remainders  into  the  Assimilation  box.   Before  continuing  with 
the  description  we  must  note  a  slight  complication  again  arising  from 
bogus  overflow.  In  Figure  5,  as  the  inputs  to  the  model  division  are 
moved  to  the  right,  zeros  are  shown  occupying  all  positions  to  the 
left  of  the  highest  order  input.  The  range  restrictions  on  the  shifted 
partial  remainders  are  such  that  the  positions  shown  as  zero  should 
indeed  be  zero  if  the  partial  remainders  were  not  in  a  redundant  form. 
But due to bogus overflow, the highest order digit of the input to the
model may be 1 with a -1 in the position immediately to the left, or
vice versa. To compensate for this behavior the magnitude of the digit
immediately to the left of the model input is monitored. If it is
non-zero, then the sign of the high-order digit into the model is
complemented as it is gated into the Assimilation box. Note that the
0th bit of
the  UM-Register  is  equivalent  to  the  8th  position  of  the  LM-Register. 

                                   Shift Gate Selected
                                   for magnitude of
                                   quotient digit =
Stage    Input to Model              2         1         0

  1      Output of US-UM           ML7Y1     ML6Y1     none
  2      Output of SDS 1           ML5Y2     ML4Y2     none
  3      Output of SDS 2           ML3Y3     ML2Y3     none
  4      Output of SDS 3           ML1Y4     MDY4      none

At each successive stage the positions taken as input to the model lie
two positions further to the right. The positions to the left of the
model input (two at the output of SDS 1, four at SDS 2, six at SDS 3)
are shown as zeros.

Note:   The symbol . represents the radix point for the radix 4 model
division. The symbol : represents the radix point for the
full precision division, radix 256.

SETTING OF NEG SIGNALS:

Division Stage No.     Positive Partial        Negative Partial
                       Remainder               Remainder
                       (A_0 = 0)               (A_0 = 1)

        1              NEG0 = NEG1 = 1         NEG0 = NEG1 = 0
        2              NEG1 = NEG2 = 1         NEG1 = NEG2 = 0
        3              NEG2 = NEG3 = 1         NEG2 = NEG3 = 0
        4              NEG3 = NEG4 = 1         NEG3 = NEG4 = 0

Figure 5 - Connection of Model Division to Full Precision
Structure

The Assimilation box produces a two's complement version
of the estimate of the shifted partial remainder. This together
with the Divisor Interval Select Logic drives the Quotient Select
Table. The quotient digits are represented in the same signed
digit format as produced by the subtracters. The following gives the
signed digit representation of each quotient digit value:

(Table: signed digit representation of each quotient digit value, from
+2 through -2, with separate entries for positive and negative zero.)

Note  that  a  distinction  is  made  between  a  positive  and 
negative  zero.   The  sign  of  all  digits,  including  zero,  is  the 
same as the sign of the partial remainder. If the digit +0 is formed
then zero is subtracted from the partial remainder. If the digit
-0 is formed then zero is added to the partial remainder. As shown
in  the  proof  in  the  Appendix  this  method  of  handling  a  zero  quotient 
digit  eliminates  bogus  overflow  at  position  1  for  division. 

The  quotient  digits  are  buffered  until  eight  are  collected. 
They  are  then  gated  to  the  low-order  byte  of  the  UH-UQ  Register. 
The quotient digits also set up the shift gates and NEG signals in
accordance with the description in Figure 6. The operating time of the
model  is  summarized  in  Table  5. 

Block                          No. of Collector Delays

Input Gating                             2
Assimilation                             3
Quotient Selection                       2
Quotient Storage and
  Shift Control                          3

Total                                   10


Table 5 - Operating Times of the Model Division


It  should  be  emphasized  that  the  scheme  used  in  the 
model  division  is  but  one  of  many  possibilities.   Since  the  amount 
of  logic  involved  is  quite  small  (10  cards),  and  has  a  well 
defined  interface  and  is  physically  one  package,  it  is  quite 
feasible  to  replace  the  model  with  new,  hopefully  improved  versions. 
The  operating  time  for  division  relative  to  the  operating  time 
for  multiplication  is  primarily  a  function  of  the  relative  operating 
times  of  the  multiplier  recoder  and  model  division.   The  concept 
of a model division and the analogy to the multiplier recoder
offer several interesting areas of research, some of which are
being explored by the author in Ph.D. thesis research.

Division  Structure 


As  mentioned  earlier,  a  primary  motivation  for  use  of 
the  model  division  approach  is  its  high  compatibility  with 
multiplication.   The  division  structure  is  the  same  as  the 
multiplication  structure  described  in  conjunction  with  Figure  3. 

Brief Operational Description of Full Precision Division Scheme

The fractional part of the dividend is loaded into the
UQ-Register from the V-Bus. The fractional part of the divisor is
loaded into the UH-Register from the V-Bus. Both fractions are
7 bytes (56 bits) long. The range of a normalized fraction, $f$,
is given by $1/16 \le f < 1$. The model division scheme requires
that the divisor, $d$, be in the range $1/2 \le d < 1$. If the given
divisor is not in this range then both the divisor and dividend
are shifted left until it is in range. After normalization, the
divisor is forwarded to the M-Register and the dividend is
forwarded to the UM-Register. The US-Register is cleared, i.e. all
sign bits are set to 0. One division loop consists of the
following sequence of steps:

1)  The  contents  of  US-UM  (dividend)  is  gated  into 
the  subtracter  cascade.   The  model  division 
successively  sets  up  the  shift  gates  and  NEG  signals 
in  accordance  with  the  previous  description. 

2)  The  output  of  subtracter  cascade  is  loaded  into 
secondary  rank  of  accumulator,  LS-LM. 

3)  The quotient (sign bits in UH, magnitude bits in UQ)
is shifted left 8 bits and the 8 digits buffered in
the model division are inserted into the low-order
byte of UH-UQ. The secondary rank of the accumulator
(LS-LM) is shifted left 8 bits into the primary
rank (US-UM).

Due  to  the  initial  normalization  of  the  divisor  and 
corresponding  shifting  of  the  dividend,  the  dividend  may  extend 
across  8  bytes.   The  division  loop  must  therefore  be  executed  8 
times.   After  the  last  loop  the  quotient  in  the  UH  and  UQ  Registers 
is  transferred  to  the  US  and  UM  Registers,  respectively.   The 
quotient is then assimilated in the same manner as described in
the  brief  operational  description  of  multiplication. 

The range of the quotient for the division of two non-zero
fractions $f_1$ and $f_2$ is given by $1/16 < f_1/f_2 < 16$. A quotient
may therefore require a terminal right shift of 4 bits accompanied
by an increase of the exponent. Division by zero or into zero is
detected during preliminary steps of the division operation.

Truncation  Error 

The  range  of  the  truncation  error,  e  ,  due  to  truncation 

after  56  signed  digits  is  given  by  -2  ''  <  e  <  2  '  .  If  a  terminal 

right  shift  of  the  assimilated  result  takes  place  e  is  brought 

into  the  range  -2    <  e  <  2  '  ,  however,  the  first  range  is  the 
best  case  that  can  be  guaranteed. 


APPENDIX


The  Appendix  includes  the  proof  of  the  validity  of  the  bogus 
overflow  correction  scheme  and  an  introduction  to  the  entire  Illiac  III 
system. 

PROOF OF THE VALIDITY OF THE BOGUS
OVERFLOW  CORRECTION  SCHEME 


We are concerned with the value of the output of positions 0 and 1
of the subtracter. Inspection of the equations for the subtracter defined in
Figure 3 reveals that the values of these outputs, $Z^*_0$ and $Z^*_1$, are
functions only of the inputs to positions 0, 1 and 2. Since the 0th position
is not implemented, $S_0$ and $X_0$ are implicitly both zero. Furthermore,
since the subtrahend is always considered to be positive and can never be
greater than 128, $Y_0$ and $Y_1$ are also both zero. Table A-1 enumerates
$Z^*_0$ and $Z^*_1$ as functions of $X^*_1$, $X^*_2$ and $Y_2$. Recall the
notational convention defined below:


T_i     Z_i     Z*_i

 0       0        0
 0       1        1
 1       0       -0
 1       1       -1

A 0 digit under the $X^*_1$ or $X^*_2$ columns may be either a positive or
negative zero. The table is defined for NEG = 0. If NEG were to be 1, the
signs  of  all  output  digits  would  be  complemented  but  magnitudes  and  thus 
bogus  overflow  conditions  would  be  the  same.   It  is  therefore  sufficient  to 
complete  the  proof  for  NEG  =  0;  the  proof  for  NEG  =  1  follows  immediately  by 
symmetry. 

For all the cases in Table A-1 for which $Z^*_0$ is zero no problem
arises. Note that $Z^*_0$ is non-zero if and only if $X^*_1 = -1$, in other
words, when $S_1 X_1 = 1$. For the cases marked with * the bogus overflow
scheme is valid.
For  those  marked  with  #  the  scheme  is  not  valid  but  we  shall  show  that  within 
the  constraints  of  the  Illiac  III  implementation  these  cases  cannot  occur. 
The  proof  is  considered  for  the  three  classes  of  operations  in  which  the 
subtracter  cascade  is  used,  namely,  addition-subtraction,  division  and 
multiplication. 
Entries in Table are Z0* (weight 256) and Z1* (weight 128)

                                   (a)         (b)
Row No.     X1*       X2*        Y2 = 0      Y2 = 1
           (128)      (64)        (64)        (64)

   1         0         0           00          01
   2         0         1           00          00
   3         0        -1           01          01
   4         1         0           01          00
   5        -1         0           11*         10#
   6         1         1           01          01
   7         1        -1           00          00
   8        -1         1           11*         11*
   9        -1        -1           10#         10#

Notes:   S0 = X0 = Y0 = Y1 = 0

         NEG = 0

         A 0 digit may be positive or negative. Numbers in parentheses
         indicate the weight of the digital position.

         *Indicates bogus overflow which will be corrected by
          complementing the sign of Z1* and disposing of Z0*.

         #Indicates cases in which this correction scheme is not valid.


TABLE A-1 - Possible Values of Z0* and Z1*
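
The entries of Table A-1 can be regenerated mechanically, which is a
convenient check on the case analysis that follows. The sketch below is not
taken from the subtracter definition: it assumes a simple borrow-save digit
rule (each position forms x - y, sends a borrow of weight -2 to the left when
the difference is negative, and the output digit is the retained bit minus the
borrow arriving from the right). It also assumes that rows 1 through 9 carry
the inputs (X1*, X2*) = (0,0), (0,1), (0,-1), (1,0), (-1,0), (1,1), (1,-1),
(-1,1), (-1,-1). The function names are hypothetical.

```python
# Assumed model of the two leading subtracter positions (NEG = 0,
# X0 = Y0 = Y1 = 0). It was chosen because it reproduces the tabulated
# patterns, not because the report states it in this form.

ROWS = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0),
        (1, 1), (1, -1), (-1, 1), (-1, -1)]


def subtract_top_digits(x1, x2, y2):
    """Return (z0, z1) for signed digits x1, x2 in {-1, 0, 1} and y2 in {0, 1}."""
    d2 = x2 - y2
    borrow2 = 1 if d2 < 0 else 0   # borrow out of the weight-64 position
    d1 = x1                        # Y1 = 0, so nothing is subtracted here
    borrow1 = 1 if d1 < 0 else 0   # borrow out of the weight-128 position
    w1 = d1 + 2 * borrow1          # retained bit at weight 128, in {0, 1}
    z1 = w1 - borrow2              # output digit of weight 128
    z0 = -borrow1                  # output digit of weight 256
    return z0, z1


def classify(x1, x2, y2):
    """Magnitude pattern 'z0 z1' plus the * or # flag used in Table A-1."""
    z0, z1 = subtract_top_digits(x1, x2, y2)
    pattern = f"{abs(z0)}{abs(z1)}"
    if z0 != 0:
        # 11: sign of z1 can be complemented and z0 dropped (bogus overflow).
        # 10: no such correction exists.
        pattern += "*" if z1 != 0 else "#"
    return pattern


def build_table():
    """List of (column a, column b) entries for rows 1 through 9."""
    return [(classify(x1, x2, 0), classify(x1, x2, 1)) for x1, x2 in ROWS]


if __name__ == "__main__":
    for n, ((x1, x2), (a, b)) in enumerate(zip(ROWS, build_table()), start=1):
        print(f"{n}   {x1:>2}  {x2:>2}    {a:<4} {b:<4}")
```

Under this model a non-zero Z0* appears exactly in the rows with X1* = -1,
which is the property the proof relies on.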
Addition - Subtraction

The radix point for floating point operations is between positions
8 and 9 of the subtracters. All operands are less than 1; therefore
X1* = X2* = 0 and Y2 = 0. We are therefore restricted to row 1, column a
(denoted 1-a) of the table, and bogus overflow is avoided.


Division 

We have shown that bogus overflow arises if and only if X1* = -1.
In division the sign of X1* is also the sign of the partial remainder, and
when the partial remainder is negative it is added to the selected multiple
of the divisor. Addition requires that X1* be complemented prior to entry
into the subtracter, and thus X1* becomes +1. In division the subtracters
therefore always see only positive inputs, and the states in rows 5, 8 and 9
cannot occur. Bogus overflow is avoided altogether.


Multiplication 

The multiplicand, M, is stored in the M-Register and is in the
range 1/16 to 1. The maximum multiple of M which may be formed is 128
times M (at the first subtracter) and thus Y2 can be equal to 1 only at the
first subtracter. The contents of the accumulator (US-UM), the signed-
digit input to the first subtracter, is always less than 1 in magnitude
and therefore X1* = X2* = 0. Thus all entries in column b except 1-b are
eliminated as possibilities. The remaining task is to show that case 9-a
cannot occur.

At this point we must note a property of the multiplier recoding
scheme defined in Table 3. This property is that 128 is the maximum multiple
of the multiplicand which may be combined with the partial product in any one
pass through the subtracter cascade. This may be established by considering
a group of nine bits which are to be recoded. If +2 or -2 is selected as
the recoded version of the leftmost trio of bits, then all recoded digits
to the right are either zero or of opposite sign. Recall from Figure 4
that the selection of 2 at the left of the recoding logic generates a 7-bit
left shift of M into the first subtracter.
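
The bound of 128 can be checked by brute force over all nine-bit groups.
Table 3 is not reproduced in this appendix, so the sketch below substitutes
the textbook radix-4 (modified Booth) recoding of overlapping bit trios. That
substitution is an assumption: the Illiac III recoder may differ in detail,
and the sketch checks only the magnitude bound and the net sign of the digits
to the right of a leading 2, not the digit-by-digit property stated above.
The function names are hypothetical.

```python
# Stand-in recoder: textbook modified-Booth rule, NOT the report's Table 3.

def recode_group(group9):
    """Recode a 9-bit group (8 multiplier bits plus one overlap bit on the
    right), given as an int in 0..511, into four radix-4 digits in -2..2,
    most significant digit first."""
    bits = [(group9 >> k) & 1 for k in range(9)]   # bits[0] is the overlap bit
    digits = []
    for j in range(3, -1, -1):
        low, mid, high = bits[2 * j], bits[2 * j + 1], bits[2 * j + 2]
        digits.append(low + mid - 2 * high)
    return digits


def multiple_of_m(digits):
    """Multiple of the multiplicand selected in one pass; the four digits
    carry weights 64, 16, 4 and 1."""
    return sum(d * 4 ** (3 - i) for i, d in enumerate(digits))


def largest_multiple():
    """Largest magnitude of the selected multiple over all 512 groups."""
    return max(abs(multiple_of_m(recode_group(g))) for g in range(512))


def leading_two_is_never_reinforced():
    """When the leftmost digit is +2 or -2, the remaining digits must sum to
    zero or to a value of the opposite sign."""
    for g in range(512):
        digits = recode_group(g)
        if abs(digits[0]) == 2:
            rest = multiple_of_m(digits) - 64 * digits[0]
            if rest * digits[0] > 0:
                return False
    return True


if __name__ == "__main__":
    print(largest_multiple())
    print(leading_two_is_never_reinforced())
```

A leading digit of 2 at weight 64 selects 128 x M, that is, M shifted left by
seven bit positions, which agrees with the shift quoted from Figure 4.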

Having ruled out cases 5-b and 9-b, we may state that the 10
condition in 9-a occurs if and only if X1* = X2* = -1. This means that the
algebraic value of the signed-digit input to the subtracter is more negative
than -128. This clearly cannot be the case unless a multiple of 128 x M
has been combined with a non-zero partial product of the same sign as the
multiple. This may occur only in the first subtracter. If case 1-a
occurs, the partial product is less than 128 in magnitude and thus cannot
become more negative than -128 by subsequent operations in the subtracter
cascade.

Case 1-b can occur only if a multiple of 128 is selected and thus
the subsequent subtracters can only either preserve the value of the partial
product by subtraction of zeros or decrease it in magnitude. If the
magnitude is decreased, it will be decreased to less than 128 and thus case 9-a
is immediately ruled out. If subtraction or addition of zero occurs at
subsequent subtracters the 01 pattern in 1-b will propagate through and cannot
become 10. This is demonstrated by the following reasoning. For case 1-b
with X* = 0, Z* is never 1. Z* and Z* are the X*, X* inputs to the next
subtracter and thus we are brought to either row, but will be corrected back
to either the case Z* = 1, Z* = 1 or to the case Z* = 1, Z* = 0. The 01
pattern of case 1-b will therefore pass through all of the subtracters and
will never be reformed as 11 in 9-a.
BRIEF DESCRIPTION OF THE ILLIAC III COMPUTER SYSTEM*


The  Illinois  Pattern  Recognition  Computer,  Illiac  III, 
is  a  digital  processor  for  visual  information.   It  is  primarily 
designed  for  automatic  scanning  and  analysis  of  massive  amounts  of 
relatively  homogeneous  visual  data.   In  particular  the  design  is 
an  outgrowth  of  studies  at  this  laboratory  of  a  computer  system 
capable  of  scanning,  measuring  and  analyzing  in  excess  of  3  x  10 
"bubble  chamber  negatives  per  year. 

Illiac  III,  though  specifically  designed  to  process  visual 
information,  also  provides  complete  facilities  for  standard  general- 
purpose  computation.   Both  the  picture  processing  and  general- 
purpose  computation  facilities  of  Illiac  III  will  be  available  to 
users  on  a  time-sharing  basis. 

As can be seen in Figure A-1, Illiac III is a multiprocessor
computer system. Six processors (4 Taxicrinic Processors
and 2 Input/Output Processors) access in parallel the computational/
storage units consisting of 2 Arithmetic Units, 1 Interrupt Unit,
1 Pattern Articulation Unit, and 4 Storage Units. Each computational/
storage unit of the computer system specializes in a particular
activity. Thus, for example, all floating-point computation is
done in the Arithmetic Units, while picture processing is performed
primarily by the Pattern Articulation Unit. Processors, on the
other hand, analyze user jobs and route their constituent tasks
to the appropriate specialized processing units. The individual
processors of the system can operate simultaneously and independently
(within the limits imposed by the System Supervisor) with a consequent
increase in overall efficiency.

The Input/Output Processors (IOP) are attached via Channel
Interface Units and Device Controllers to various input and output
devices. Among facilities important for the ingestion of visual
information are 8 CRT flying spot scanners: two for 70 mm film,
two for 46 mm film, two for microfilm/microfiche, and two for
microscope slides. These scanners can also operate as film
cameras and thus serve as both input and output devices. Monitor
stations have also been attached to the Input/Output system.
These each consist of a CRT display, a typewriter, and a magnetic
tape unit, and are provided to assist human control of the analysis.

* From Section 1 of the Illiac III Programming Manual.

Figure A-1   Schematic of Illiac III Computer

The  duty  of  the  Pattern  Articulation  Unit  (PAU)  is  to 
perform  local  preprocessing  on  the  input  from  the  scanners,  such 
as  track  thinning,  gap  filling,  line  element  recognition,  etc. 
The  logical  design  of  this  all-digital  processor  has  been  optimized 
for  the  idealization  of  the  input  image  to  a  line  drawing.   Nodes 
representing end points, points of inflection, points of intersection,
etc. are labeled in parallel by appropriate programming
under overall control of the Taxicrinic Processor. The abstract
graph describing the interconnection of labeled nodes is then
extracted  as  a  list  structure,  which  comprises  the  normal  output  of 
the  Pattern  Articulation  Unit. 

This  output  is  then  operated  on  by  a  Taxicrinic  Processor 
(TP),  which  assembles  such  graphs  into  coherent  list  structures 
subject  to  a  recognition  grammar  and  then  syntactically  categorizes 
them to complete the visual recognition process. The Taxicrinic
Processors are primarily responsible for the execution of user
programs; that is, they oversee the operations of the Pattern
Articulation Unit and the Arithmetic Unit and initiate input/
output operations in the IOPs by making requests to the Interrupt
Unit.

The  Arithmetic  Unit  (AU)  is  used  exclusively  for  performing 
arithmetic  operations  for  the  TP.   Although  there  are  a  few  simple 
arithmetic operations which can be done in a TP (e.g., integer
addition)  the  more  complicated  operations  are  done  in  the  AU. 
The  AU  has  been  optimized  for  double-word  floating  point  arithmetic. 

The  Interrupt  Unit  (IU)  handles  all  the  interrupt  requests 
from  the  TP  and  IOP.  When  an  interrupt  is  requested  it  notifies 
the  proper  processors  which  then  take  appropriate  action. 
All of the Illiac III processors and units communicate
with each other through the Exchange Net (XN) as shown in Figure
A-1. The Exchange Net is responsible for all the necessary
queueing and priority checking.

As  noted  above,  there  is  indeed  a  reason  for  calling 
one  piece  of  equipment  a  processor  and  another  a  unit,  even  though 
the  type  of  operations  they  perform  may  both  appear  to  be  "processing" 
operations.   In  the  Illiac  III  system  all  major  modules  are 
designated  as  either  "processors"  or  "units"  according  to  their 
position  in  the  Exchange  Net.   In  Figure  A-1,  the  processors
are  shown  at  the  top  and  bottom  and  the  units  are  shown  on  the 
right.   The  effect  of  this  division  is  that  processors  may  communicate 
directly  with  units  and  vice  versa  but  may  not  communicate  directly 
with  each  other.   If  a  processor  needs  to  communicate  with  another 
processor  it  must  get  help  from  a  unit  (normally  the  Interrupt 
Unit)  and  if  a  unit  (say  the  PAU)  wants  to  communicate  with  another 
unit  (say  a  storage  unit)  the  information  must  be  transferred 
through  a  processor  (the  TP  in  this  case). 
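
The communication rule amounts to a two-sided graph: a direct link exists
only between a processor and a unit. The sketch below is a minimal model of
that rule, not a description of the Exchange Net hardware. The module labels
(TP1, SU1 and so on) are hypothetical, and the counts assume four Taxicrinic
Processors and four Storage Units.

```python
# Hypothetical labels; only the processor/unit split comes from the text.
PROCESSORS = {"TP1", "TP2", "TP3", "TP4", "IOP1", "IOP2"}
UNITS = {"AU1", "AU2", "IU", "PAU", "SU1", "SU2", "SU3", "SU4"}


def kind(module):
    """Return 'processor' or 'unit' for a known module label."""
    if module in PROCESSORS:
        return "processor"
    if module in UNITS:
        return "unit"
    raise ValueError(f"unknown module: {module}")


def can_talk_directly(a, b):
    """Direct communication is allowed only across the processor/unit split."""
    return kind(a) != kind(b)


def route(src, dst, relay):
    """Return the list of modules a message visits on its way from src to dst.

    If src and dst are of the same kind, the message must pass through a
    relay of the opposite kind (for example the Interrupt Unit between two
    processors, or a Taxicrinic Processor between two units)."""
    if can_talk_directly(src, dst):
        return [src, dst]
    if kind(relay) == kind(src):
        raise ValueError("relay must be on the opposite side of the Exchange Net")
    return [src, relay, dst]


if __name__ == "__main__":
    print(route("TP1", "AU1", relay="IU"))
    print(route("TP1", "IOP1", relay="IU"))
    print(route("PAU", "SU1", relay="TP1"))
```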
Date: May 28, 1969

